Calibrating Segmentation Networks with Margin-based Label Smoothing
Despite the undeniable progress in visual recognition tasks fueled by deep
neural networks, recent evidence shows that these models are poorly
calibrated, resulting in over-confident predictions. The standard practice of
minimizing the cross-entropy loss during training pushes the predicted softmax
probabilities to match the one-hot label assignments. This, however, yields a
pre-softmax activation for the correct class that is significantly larger than
the remaining activations, exacerbating the miscalibration problem. Recent
observations from the classification literature
suggest that loss functions that embed implicit or explicit maximization of the
entropy of predictions yield state-of-the-art calibration performance. Despite
these findings, the impact of such losses on the important task of calibrating
medical image segmentation networks remains unexplored. In this work, we
provide a unifying constrained-optimization perspective of current
state-of-the-art calibration losses. Specifically, these losses can be viewed
as approximations of a linear penalty (or a Lagrangian term) imposing equality
constraints on logit distances. This points to an important limitation of such
equality constraints: their gradients constantly push towards a
non-informative solution, which might prevent the model from reaching the best
compromise between discriminative performance and calibration during
gradient-based optimization. Following our observations, we propose a simple
and flexible generalization based on inequality constraints, which imposes a
controllable margin on logit distances (see the sketch after this abstract).
Comprehensive experiments on
a variety of public medical image segmentation benchmarks demonstrate that our
method establishes new state-of-the-art results on these tasks in terms of
network calibration, while discriminative performance is also improved.

Comment: Under review. The code is available at
https://github.com/Bala93/MarginLoss. arXiv admin note: substantial text
overlap with arXiv:2111.1543
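
To make the proposed loss concrete, here is a minimal PyTorch sketch of a
margin-based penalty on logit distances, in the spirit of the abstract above.
The function names, the default margin, and the weighting factor lam are
illustrative assumptions rather than the authors' exact implementation; the
real code is in the linked repository.

    import torch
    import torch.nn.functional as F

    def margin_logit_penalty(logits, margin=10.0):
        # logits: (N, C) pre-softmax activations
        # (flatten spatial dimensions first for segmentation outputs).
        max_logit = logits.max(dim=1, keepdim=True).values   # (N, 1)
        distances = max_logit - logits                       # (N, C), all >= 0
        # Hinge on the inequality constraint d_j <= margin: only distances
        # exceeding the margin are penalized, whereas an equality constraint
        # would keep pushing every distance towards zero (non-informative).
        return F.relu(distances - margin).sum(dim=1).mean()

    def calibrated_loss(logits, targets, margin=10.0, lam=0.1):
        # Cross-entropy preserves discrimination; the margin term caps
        # how far the winning logit can run away from the others.
        return F.cross_entropy(logits, targets) \
            + lam * margin_logit_penalty(logits, margin)

Setting the margin to zero recovers the equality-constraint behaviour the
abstract criticizes, with gradients that keep pushing all logits together; a
positive margin leaves the penalty inactive once predictions are reasonably
spread, freeing the cross-entropy term to remain discriminative.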
Prompting classes: Exploring the Power of Prompt Class Learning in Weakly Supervised Semantic Segmentation
Recently, CLIP-based approaches have exhibited remarkable performance on
generalization and few-shot learning tasks, fueled by the power of contrastive
language-vision pre-training. In particular, prompt tuning has emerged as an
effective strategy to adapt the pre-trained language-vision models to
downstream tasks by employing task-related textual tokens. Motivated by this
progress, in this work we question whether other fundamental problems, such as
weakly supervised semantic segmentation (WSSS), can benefit from prompt tuning.
Our experiments reveal two interesting observations that shed light on the
impact of prompt tuning on WSSS. First, modifying only the class token of the
text prompt has a greater impact on the Class Activation Map (CAM) than
arguably more complex strategies that optimize the context. Second, the class
token associated with the image ground truth does not necessarily correspond
to the category that yields the best CAM. Motivated by these
observations, we introduce a novel approach based on a PrOmpt cLass lEarning
(POLE) strategy. Through extensive experiments, we demonstrate that our simple
yet efficient approach achieves state-of-the-art performance on a well-known
WSSS benchmark. These results highlight not only the benefits of
language-vision models in WSSS but also the potential of prompt learning for
this problem (a minimal sketch of class-token scoring follows this abstract).
The code is available at https://github.com/rB080/WSS_POLE.

Comment: Under review
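
As an illustration of both observations, the sketch below scores alternative
class tokens for an image by CLIP image-text similarity while keeping the
prompt context fixed; the best-scoring token, which may differ from the
ground-truth label name, would then seed CAM generation in a WSSS pipeline. It
assumes the OpenAI clip package, and the prompt template, candidate list, and
function name are hypothetical; POLE's actual prompt-learning procedure is in
the linked repository.

    import torch
    import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)

    @torch.no_grad()
    def score_class_tokens(image, candidates, context="a photo of a {}."):
        # Swap only the class token inside a fixed context (observation 1)
        # and rank the candidates by image-text similarity (observation 2).
        image_feat = model.encode_image(preprocess(image).unsqueeze(0).to(device))
        tokens = clip.tokenize([context.format(c) for c in candidates]).to(device)
        text_feats = model.encode_text(tokens)
        image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
        text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)
        return (image_feat @ text_feats.T).squeeze(0)  # one score per candidate

    # Hypothetical usage: a synonym of the label may yield a better CAM.
    # scores = score_class_tokens(Image.open("dog.jpg"), ["dog", "puppy", "canine"])

In a full pipeline, the top-ranked token would replace the ground-truth class
name in the prompt before computing the CAM, which is precisely where the
second observation pays off.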